Automatic sleep stage classification from multi-channel electroencephalography (EEG) is a fundamental problem in clinical neurophysiology, made difficult by high inter-subject variability, severe class imbalance, and the sequential correlation between consecutive sleep epochs. In this work, we propose EEGSeqNet, a hybrid deep learning framework composed of a 3-layer one-dimensional CNN for epoch-wise feature extraction, a 2-layer BiLSTM encoder for sequential context learning, and a 4-head Multi-Head Self-Attention mechanism that models long-range dependencies over a context window of 15 consecutive 30-second sleep epochs. The model is trained with a three-stage strategy consisting of one-cycle learning rate warm-up, focal-loss fine-tuning with varying learning rates, and a Stochastic Weight Averaging (SWA) polishing stage, and it is evaluated on subject-disjoint test sets of the Sleep-EDF Expanded (Sleep-Cassette) corpus. EEGSeqNet reaches 90.14% accuracy, 89.76% balanced accuracy, 89.75% macro F1-score, and 98.77% macro ROC-AUC on the subject-disjoint test set, exceeding the accuracies reported for recent state-of-the-art methods on comparable tasks, including MultiScaleSleepNet (85.6% on Sleep-EDFx), XSleepNet2 (86.3%), SeqSleepNet (87.1%), AttnSleep (85.6%), and DeepSleepNet (82.0%). These results indicate that modeling sequential context with a hybrid architecture, combined with extensive regularization and multi-stage training, is a promising route to strong subject-disjoint generalization.
Introduction
This paper focuses on automatic sleep stage classification from EEG signals, a key problem in sleep medicine where traditional manual scoring of polysomnography (PSG) recordings is slow, labor-intensive, and often inconsistent between experts. Sleep is essential for memory, immune function, and cognitive health, and is typically analyzed in 30-second epochs classified into five stages: Wake, N1, N2, N3, and REM.
Challenges in Sleep Stage Classification
Automatic sleep staging is difficult due to:
Strong temporal dependencies between sleep stages (stages follow structured transitions)
Severe class imbalance (e.g., N2 dominates while N1 is rare)
High inter-subject variability in EEG patterns, making generalization difficult
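To make the class-imbalance point concrete, the sketch below compares plain cross-entropy with the focal loss of Lin et al. [10], which the training strategy uses in its fine-tuning stage. It is a minimal pure-Python illustration, not the authors' implementation: the focusing parameter `gamma=2.0`, the optional per-class weights `alpha`, and the two example probability vectors are assumptions chosen for illustration.

```python
import math

STAGES = ["Wake", "N1", "N2", "N3", "REM"]


def cross_entropy(probs, target):
    """Standard cross-entropy for one epoch: -log(p_t)."""
    return -math.log(probs[target])


def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Focal loss for one epoch: -alpha_t * (1 - p_t)**gamma * log(p_t).

    gamma=2.0 is an assumed value, not one taken from the paper.
    alpha is an optional list of per-class weights (also an assumption).
    """
    p_t = probs[target]
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical softmax outputs for two epochs.
easy_n2 = [0.02, 0.03, 0.90, 0.03, 0.02]  # confident, correct N2 prediction
hard_n1 = [0.10, 0.30, 0.45, 0.05, 0.10]  # true stage N1 receives only 0.30

for name, probs, target in [("easy N2", easy_n2, 2), ("hard N1", hard_n1, 1)]:
    ce = cross_entropy(probs, target)
    fl = focal_loss(probs, target)
    print(f"{name} ({STAGES[target]}): CE={ce:.4f} focal={fl:.4f} ratio={fl / ce:.3f}")
```

The ratio of focal loss to cross-entropy equals `(1 - p_t)**gamma`, so the confidently classified N2 epoch is scaled down far more than the poorly classified N1 epoch. This is how the loss shifts the gradient budget toward rare, hard stages.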
Existing Approaches
Earlier deep learning models addressed parts of the problem:
CNN-based models extract spatial/temporal EEG features (e.g., DeepSleepNet, TinySleepNet)
Multi-stage training strategy to handle imbalance and improve stability
Strong use of augmentation to improve generalization
Subject-independent evaluation protocol
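A subject-independent protocol means that all epochs from a given subject fall entirely in either the training set or the test set. The sketch below shows one way to build such a split in pure Python. The function name `subject_disjoint_split`, the 20% test fraction, the seed, and the toy subject identifiers are hypothetical and are not taken from the paper.

```python
import random


def subject_disjoint_split(subject_ids, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) such that no subject appears in both.

    subject_ids[i] is the subject that epoch i was recorded from.
    The split is made over subjects, never over individual epochs.
    """
    subjects = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    held_out = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(subject_ids) if s not in held_out]
    test_idx = [i for i, s in enumerate(subject_ids) if s in held_out]
    return train_idx, test_idx


# Toy data: 10 hypothetical subjects with 5 epochs each.
subject_ids = [f"S{k:02d}" for k in range(10) for _ in range(5)]
train_idx, test_idx = subject_disjoint_split(subject_ids)

train_subjects = {subject_ids[i] for i in train_idx}
test_subjects = {subject_ids[i] for i in test_idx}
print("held-out subjects:", sorted(test_subjects))
print("overlap:", train_subjects & test_subjects)
```

Splitting at the epoch level instead would let neighbouring epochs from the same night land on both sides of the split, which tends to inflate test scores and hides the inter-subject variability the protocol is meant to expose.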
Results
The model achieves 90.14% test accuracy and 98.77% macro ROC-AUC under subject-disjoint evaluation, indicating strong generalization across different individuals.
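Because the stages are imbalanced, plain accuracy can look strong even when a rare stage is poorly recognized, which is why balanced accuracy and macro F1-score are reported alongside it. The sketch below computes all three from a confusion matrix in pure Python. It uses a hypothetical two-class toy example for brevity and is not the authors' evaluation code.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """cm[t][p] counts epochs with true class t predicted as class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def summary_metrics(cm):
    """Return (accuracy, balanced accuracy, macro F1) from a confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    recalls, f1_scores = [], []
    for i in range(n):
        tp = cm[i][i]
        support = sum(cm[i])                        # true members of class i
        predicted = sum(cm[r][i] for r in range(n))  # epochs predicted as class i
        recall = tp / support if support else 0.0
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)
    # Balanced accuracy is the unweighted mean of per-class recall;
    # macro F1 is the unweighted mean of per-class F1.
    return accuracy, sum(recalls) / n, sum(f1_scores) / n


# Toy imbalanced example: 8 epochs of a common class, 2 of a rare class,
# with one rare-class epoch misclassified.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]
acc, bal_acc, macro_f1 = summary_metrics(confusion_matrix(y_true, y_pred, 2))
print(f"accuracy={acc:.3f} balanced_accuracy={bal_acc:.3f} macro_f1={macro_f1:.3f}")
```

In the toy example a single error on the rare class barely moves accuracy but halves that class's recall, so balanced accuracy and macro F1 drop noticeably.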
Conclusion
We presented EEGSeqNet, a hybrid CNN-BiLSTM-Attention architecture for automatic sleep stage classification from multi-channel signals (EEG Fpz-Cz, EEG Pz-Oz, and EOG), operating over a context window of 15 consecutive 30-second epochs. The model integrates a three-layer 1D-CNN per-epoch feature extractor, a two-layer BiLSTM sequence encoder, and a four-head Multi-Head Self-Attention block. Trained with a three-stage curriculum (OneCycleLR warm-up, Focal Loss fine-tuning, SWA polishing) and evaluated under a strict subject-disjoint protocol, EEGSeqNet achieves a test accuracy of 90.14%, a balanced accuracy of 89.76%, a macro F1-score of 89.75%, and a macro ROC-AUC of 98.77% on the Sleep-EDF Expanded dataset. These results exceed the accuracies reported for recent state-of-the-art systems, including MultiScaleSleepNet (85.6% on Sleep-EDFx), XSleepNet2 (86.3%), SeqSleepNet (87.1%), AttnSleep (85.6%), and DeepSleepNet (82.0%).
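The three components named above can be wired together as in the following PyTorch sketch. It illustrates the data flow only and is not the authors' implementation. The layer counts, head count, three input channels, and 15-epoch window follow the text. The class name `EEGSeqNetSketch`, kernel sizes, channel widths, hidden size, global average pooling, residual connection with LayerNorm, per-epoch output for every position in the window, and the 3000 samples per epoch (30 seconds at an assumed 100 Hz) are all assumptions.

```python
import torch
import torch.nn as nn


class EEGSeqNetSketch(nn.Module):
    """Hypothetical CNN -> BiLSTM -> self-attention stack (sizes are assumed)."""

    def __init__(self, n_channels=3, n_classes=5, feat_dim=128, hidden=128, n_heads=4):
        super().__init__()
        # Three Conv1d layers applied to each 30-second epoch independently.
        self.cnn = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, stride=2, padding=3),
            nn.BatchNorm1d(32), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, stride=2, padding=3),
            nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, feat_dim, kernel_size=7, stride=2, padding=3),
            nn.BatchNorm1d(feat_dim), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # one feature vector per epoch
        )
        # Two-layer bidirectional LSTM over the sequence of epoch features.
        self.bilstm = nn.LSTM(feat_dim, hidden, num_layers=2,
                              batch_first=True, bidirectional=True)
        # Four-head self-attention over the BiLSTM outputs.
        self.attn = nn.MultiheadAttention(2 * hidden, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(2 * hidden)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):
        # x: (batch, seq_len, channels, samples)
        b, s, c, t = x.shape
        feats = self.cnn(x.reshape(b * s, c, t)).squeeze(-1).reshape(b, s, -1)
        ctx, _ = self.bilstm(feats)             # (batch, seq_len, 2 * hidden)
        attended, _ = self.attn(ctx, ctx, ctx)  # query, key, value are all ctx
        ctx = self.norm(ctx + attended)         # residual + LayerNorm (assumed)
        return self.head(ctx)                   # (batch, seq_len, n_classes)


model = EEGSeqNetSketch().eval()
with torch.no_grad():
    # 2 windows, 15 epochs each, 3 channels, 3000 samples per epoch.
    logits = model(torch.randn(2, 15, 3, 3000))
print(logits.shape)
```

The CNN sees one epoch at a time, so the BiLSTM and attention layers are the only places where information crosses epoch boundaries. That separation is what makes the 15-epoch context window the unit of sequential modeling.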
Future work will extend EEGSeqNet to EMG integration, clinical patient cohorts, cross-dataset transfer learning, and efficient edge-device deployment.
References
[1] K. Ramar et al., “Sleep is essential to health: An American Academy of Sleep Medicine position statement,” Journal of Clinical Sleep Medicine, vol. 17, no. 10, pp. 2115–2119, 2021.
[2] C. Iber, S. Ancoli-Israel, A. Chesson, and S. Quan, The AASM Manual for the Scoring of Sleep and Associated Events. American Academy of Sleep Medicine, 2007.
[3] H. Phan and K. Mikkelsen, “Automatic sleep staging of EEG signals: Recent development, challenges, and future directions,” Physiological Measurement, vol. 43, no. 4, p. 04TR01, 2022.
[4] A. Supratak, H. Dong, C. Wu, and Y. Guo, “DeepSleepNet: A model for automatic sleep stage scoring based on raw single-channel EEG,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 25, no. 11, pp. 1998–2008, 2017.
[5] E. Eldele, Z. Chen, C. Liu, M. Wu, C. Kwoh, X. Li, and C. Guan, “An attention-based deep learning approach for sleep stage classification with single-channel EEG,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 29, pp. 809–818, 2021.
[6] H. Phan, F. Andreotti, N. Cooray, O. Y. Chén, and M. De Vos, “SeqSleepNet: End-to-end hierarchical recurrent neural network for sequence-to-sequence automatic sleep staging,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 27, no. 3, pp. 400–410, 2019.
[7] H. Phan, O. Y. Chén, M. C. Tran, P. Koch, A. Mertins, and M. De Vos, “XSleepNet: Multi-view sequential model for automatic sleep staging,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 9, pp. 5903–5915, 2021.
[8] C. Liu, Q. Guan, W. Zhang, L. Sun, M. Wang, X. Dong, and S. Xu, “MultiScaleSleepNet: A hybrid CNN-BiLSTM-Transformer architecture with multi-scale feature representation for single-channel EEG sleep stage classification,” Sensors, vol. 25, no. 20, p. 6328, 2025.
[9] A. Supratak and Y. Guo, “TinySleepNet: An efficient deep learning model for sleep stage scoring based on raw single-channel EEG,” in Proc. 42nd IEEE EMBC, 2020, pp. 641–644.
[10] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal loss for dense object detection,” in Proc. IEEE ICCV, 2017, pp. 2980–2988.
[11] P. Izmailov, D. Podoprikhin, T. Garipov, D. Vetrov, and A. G. Wilson, “Averaging weights leads to wider optima and better generalization,” in Proc. 34th UAI, 2018, pp. 876–885.
[12] A. L. Goldberger et al., “PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals,” Circulation, vol. 101, no. 23, pp. e215–e220, 2000.
[13] O. Tsinalis, P. M. Matthews, Y. Guo, and S. Zafeiriou, “Automatic sleep stage scoring with single-channel EEG using convolutional neural networks,” arXiv preprint arXiv:1610.01683, 2016.
[14] A. J. Viterbi, “Error bounds for convolutional codes and an asymptotically optimum decoding algorithm,” IEEE Trans. Inf. Theory, vol. 13, no. 2, pp. 260–269, 1967.
[15] Z. Jia, Y. Lin, J. Wang, X. Wang, P. Xie, and Y. Zhang, “SalientSleepNet: Multimodal salient wave detection network for sleep staging,” in Proc. 30th IJCAI, 2021, pp. 2614–2620.
[16] Y. Zheng, Y. Luo, B. Zou, L. Zhang, and L. Li, “MMASleepNet: A multimodal attention network based on electrophysiological signals for automatic sleep staging,” Frontiers in Neuroscience, vol. 16, p. 973761, 2022.